Recognizing Speech from Sim

Author

  • Bhiksha Raj
Abstract

In this paper we present and evaluate factored methods for recognition of simultaneous speech from multiple speakers in single-channel recordings. Factored methods decompose the problem of jointly recognizing the speech of all speakers into separately recognizing the speech of each speaker. To achieve this, the signal components of the target speaker must in each case be enhanced in some manner. We do this in two ways: with an NMF-based speaker separation algorithm that generates separated spectra for each speaker, and with a mask estimation method that generates spectral masks for each speaker; the masks must be used in conjunction with a missing-feature method that can recognize speech from partial spectral data. Experiments on synthetic mixtures of signals from the Wall Street Journal corpus show that both approaches can greatly improve the recognition of the individual signals in the mixture.
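As a rough illustration of the two enhancement routes named in the abstract, the sketch below separates a two-speaker magnitude spectrogram with a generic supervised NMF and derives a binary spectral mask from the result. It is not the paper's algorithm: it assumes that per-speaker basis matrices (`W_a`, `W_b`) have already been learned from clean training speech, it uses the standard multiplicative update for the KL divergence, and all function and variable names are hypothetical. The missing-feature recognizer that would consume the mask is not shown.

```python
import numpy as np


def nmf_activations(V, W, n_iter=200, eps=1e-12):
    """Estimate non-negative activations H with V ~= W @ H, keeping W fixed.

    Uses the standard multiplicative update for the KL divergence.
    V: (freq, time) magnitude spectrogram; W: (freq, n_bases) bases.
    """
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    col_sums = W.sum(axis=0)[:, None] + eps  # W^T 1, one value per basis
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / col_sums
    return H


def separate(V, W_a, W_b, eps=1e-12):
    """Split a mixture spectrogram V using pre-trained bases (an assumption).

    Returns the separated spectra for both speakers and a binary mask
    marking the time-frequency bins where speaker A dominates.
    """
    W = np.hstack([W_a, W_b])
    H = nmf_activations(V, W)
    k = W_a.shape[1]
    S_a = W_a @ H[:k]  # model of speaker A's contribution
    S_b = W_b @ H[k:]  # model of speaker B's contribution
    total = S_a + S_b + eps
    # Ratio-based reconstruction: the two estimates sum to the mixture.
    est_a = V * S_a / total
    est_b = V * S_b / total
    mask_a = S_a > S_b  # True where speaker A is the stronger source
    return est_a, est_b, mask_a
```

In this sketch the separated spectra (`est_a`, `est_b`) correspond to the first route, where a conventional recognizer would run on each estimate, while `mask_a` and its complement correspond to the second route, where only the bins marked reliable for a speaker would be passed to a missing-feature recognizer.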


Similar resources

Robot Arm Performing Writing through Speech Recognition Using Dynamic Time Warping Algorithm

This paper aims to develop a writing robot that recognizes the speech signal from the user. The robot arm is constructed mainly for disabled people who cannot write on their own. Here, the dynamic time warping (DTW) algorithm is used to recognize the speech signal from the user. The action performed by the robot arm in the environment is done by reducing the redundancy which frequently fac...


Masking release and modulation interference in cochlear implant and simulation listeners.

PURPOSE: To examine the effects of temporal and spectral interference of masking noise on sentence recognition for listeners with cochlear implants (CI) and normal-hearing persons listening to vocoded signals that simulate signals processed through a CI (NH-Sim). METHOD: NH-Sim and CI listeners participated in the experiments using speech and noise that were processed by bandpass filters. Depen...


A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making man-machine interaction more natural. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...


Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....



Publication date: 2005